
Conversation

@RXamzin commented Oct 6, 2025

After updating Go to v1.24+, a sharp increase in CPU utilization was detected. A heap profile revealed increased memory allocations in the Write and Close methods of the stateless gzip.Writer mode. This PR optimizes the problem area by reusing the tokens object through a sync.Pool and allocating it only when it is actually needed.
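
For reference, a minimal sketch of how the stateless path is typically exercised, assuming the klauspost/compress gzip package and its StatelessCompression level (an assumption about the call site; the production code that showed the regression may differ). In this mode every Write and Close goes through flate/stateless.go, which is where the heap profile pointed:

package main

import (
	"bytes"
	"log"

	"github.com/klauspost/compress/gzip"
)

func main() {
	var buf bytes.Buffer

	// Stateless mode keeps no encoder state between calls, so each block is
	// encoded from scratch; before this change that cost a fresh tokens
	// allocation per operation (roughly the 540 KB/op visible in the
	// benchmarks below).
	zw, err := gzip.NewWriterLevel(&buf, gzip.StatelessCompression)
	if err != nil {
		log.Fatal(err)
	}
	if _, err := zw.Write([]byte("payload to compress")); err != nil {
		log.Fatal(err)
	}
	if err := zw.Close(); err != nil {
		log.Fatal(err)
	}
}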

Benchmarks:

BEFORE

BenchmarkEncodeDigitsSL1e4-12              10141            115946 ns/op          86.25 MB/s      542379 B/op          3 allocs/op
BenchmarkEncodeDigitsSL1e5-12               1602            730674 ns/op         136.86 MB/s      541377 B/op          2 allocs/op
BenchmarkEncodeDigitsSL1e6-12                175           6851506 ns/op         145.95 MB/s      541542 B/op          2 allocs/op
BenchmarkEncodeTwainSL1e4-12                9708            131564 ns/op          76.01 MB/s      542146 B/op          3 allocs/op
BenchmarkEncodeTwainSL1e5-12                1663            684854 ns/op         146.02 MB/s      541463 B/op          2 allocs/op
BenchmarkEncodeTwainSL1e6-12                 177           6435648 ns/op         155.38 MB/s      541654 B/op          2 allocs/op

AFTER

BenchmarkEncodeDigitsSL1e4-12              34747             33800 ns/op         295.86 MB/s           8 B/op          0 allocs/op
BenchmarkEncodeDigitsSL1e5-12               1771            640723 ns/op         156.07 MB/s         160 B/op          0 allocs/op
BenchmarkEncodeDigitsSL1e6-12                181           6759226 ns/op         147.95 MB/s        1573 B/op          0 allocs/op
BenchmarkEncodeTwainSL1e4-12               35294             35304 ns/op         283.26 MB/s           8 B/op          0 allocs/op
BenchmarkEncodeTwainSL1e5-12                1939            585755 ns/op         170.72 MB/s         146 B/op          0 allocs/op
BenchmarkEncodeTwainSL1e6-12                 181           6505389 ns/op         153.72 MB/s        1573 B/op          0 allocs/op

Summary by CodeRabbit

  • Refactor
    • Optimized compression internals to reuse buffers via pooling, improving throughput and reducing memory use during repeated operations.
    • Enhances performance and consistency for both dictionary and non-dictionary compression paths across large blocks.
    • No changes to public APIs or user-facing behavior; workflows remain the same.
    • Users may see faster compression and lower memory footprint under sustained/high-volume workloads.

coderabbitai bot commented Oct 6, 2025

📝 Walkthrough

Introduces a pooled tokens object (tokensPool) in flate/stateless.go. Replaces the stack-allocated dst with pooled instances, ensuring Reset on reuse and a deferred return to the pool. Updates the statelessEnc calls to pass the pooled dst pointer directly in both the no-dict and with-dict paths, integrating pooling into the compression loop.
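
A minimal, self-contained sketch of the pooling shape being described; the tokens type and compressBlock helper below are stand-ins for the real code in flate/stateless.go, not the actual implementation:

package main

import (
	"fmt"
	"sync"
)

// tokens stands in for flate's per-block token buffer, whose repeated
// allocation was the hot spot in the heap profile.
type tokens struct {
	n int
}

func (t *tokens) Reset() { t.n = 0 }

// tokensPool mirrors the existing bitWriterPool pattern: reuse instances
// instead of allocating one per compressed block.
var tokensPool = sync.Pool{
	New: func() interface{} {
		return &tokens{}
	},
}

func compressBlock(src []byte) {
	dst := tokensPool.Get().(*tokens)
	dst.Reset()               // clear whatever state the previous user left
	defer tokensPool.Put(dst) // return to the pool even on early return

	// ... encode src into dst and flush the block ...
	dst.n = len(src)
}

func main() {
	compressBlock([]byte("example input"))
	fmt.Println("ok")
}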

Changes

Cohort / File(s) Summary of Changes
Stateless tokens pooling and call adjustments
flate/stateless.go
Added an internal tokensPool for reusing *tokens; replaced the local dst with a pooled instance, calling Reset() on reuse and deferring Put(dst); updated statelessEnc invocations to pass the pooled dst directly in both the no-dict and dict branches; integrated pooling into the compression loop while keeping the existing bitWriterPool usage.

Sequence Diagram(s)

sequenceDiagram
    autonumber
    participant Caller as compressLoop
    participant Pool as tokensPool
    participant Enc as statelessEnc
    participant Dict as optional dict

    Caller->>Pool: Get() -> dst
    note right of Pool #e6f7ff: pooled *tokens returned
    Pool-->>Caller: dst
    Caller->>Caller: dst.Reset()

    alt no dict
        Caller->>Enc: statelessEnc(dst, src)
        Enc-->>Caller: compressed block produced
    else with dict
        Caller->>Dict: access dict
        Caller->>Enc: statelessEnc(dst, src, dict)
        Enc-->>Caller: compressed block produced
    end

    Caller->>Pool: Put(dst) (deferred)
    note right of Caller #f0ffe6: dst returned to pool after block processing

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check ✅ Passed The title “flate: objects allocation optimization” succinctly captures the primary change of reducing object allocations in the flate package by introducing pooling without extraneous details. It accurately reflects the intent to optimize allocations, is clear and concise, and will be meaningful to teammates reviewing history.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9479a36 and f9a02fb.

📒 Files selected for processing (1)
  • flate/stateless.go (4 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • flate/stateless.go


@coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
flate/stateless.go (1)

64-69: Consider using any for consistency.

The tokensPool correctly follows the sync.Pool pattern established by bitWriterPool. However, for consistency with line 59, consider using any instead of interface{} as the return type.

Apply this diff:

 // tokensPool contains tokens struct objects that can be reused
 var tokensPool = sync.Pool{
-	New: func() interface{} {
+	New: func() any {
 		return &tokens{}
 	},
 }
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d10b525 and 9479a36.

📒 Files selected for processing (1)
  • flate/stateless.go (4 hunks)
🔇 Additional comments (3)
flate/stateless.go (3)

100-104: LGTM! Proper sync.Pool usage pattern.

The pooling implementation correctly follows best practices:

  • Obtains a tokens instance from the pool
  • Resets it immediately to ensure clean state
  • Uses defer to guarantee return to the pool even on early returns or errors

This change aligns with the PR's goal of reducing allocations, as confirmed by the benchmark improvements showing dramatic allocation reductions (from ~542KB/op down to 8 B/op in some cases).


128-128: LGTM! Consistent usage of pooled dst.

The changes correctly adapt to dst being a pooled *tokens instance:

  • Lines 128, 130: statelessEnc calls pass dst directly (not &dst), matching the function signature at line 176
  • Line 144: writeBlockDynamic receives dst as *tokens
  • Line 150: dst.Reset() correctly prepares the pooled instance for reuse in the next loop iteration

The integration maintains correctness while enabling the allocation optimizations described in the PR; a simplified sketch of the reuse pattern follows below.

Also applies to: 130-130, 144-144, 150-150
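
In outline, the reuse across loop iterations looks like the following. This is a simplified, self-contained sketch: encodeBlock and flushBlock are placeholders rather than the real statelessEnc and writeBlockDynamic, which take additional parameters.

package main

import "sync"

type tokens struct{ n int }

func (t *tokens) Reset() { t.n = 0 }

var tokensPool = sync.Pool{
	New: func() interface{} { return &tokens{} },
}

// encodeBlock and flushBlock stand in for statelessEnc and writeBlockDynamic;
// only the reuse pattern is illustrated here.
func encodeBlock(dst *tokens, src []byte) { dst.n = len(src) }
func flushBlock(dst *tokens)              {}

func compress(src []byte, blockSize int) {
	dst := tokensPool.Get().(*tokens) // one pooled instance for the whole call
	dst.Reset()
	defer tokensPool.Put(dst)

	for len(src) > 0 {
		todo := src
		if len(todo) > blockSize {
			todo = todo[:blockSize]
		}
		encodeBlock(dst, todo) // pass the pointer directly, no &dst needed
		flushBlock(dst)
		dst.Reset() // the same instance is reused for the next block
		src = src[len(todo):]
	}
}

func main() {
	compress(make([]byte, 100_000), 1<<15)
}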


64-69: No other tokens allocations found. All pointer instantiations of tokens in production code occur via tokensPool; other occurrences are value declarations or in tests and don’t require pooling.
